Skip to content

feat: Docker Compose local test infrastructure with seed data#24

Merged
dtsong merged 5 commits intomasterfrom
feat/docker-local-test-infrastructure
Mar 2, 2026
Merged

feat: Docker Compose local test infrastructure with seed data#24
dtsong merged 5 commits intomasterfrom
feat/docker-local-test-infrastructure

Conversation

@dtsong
Copy link
Owner

@dtsong dtsong commented Mar 2, 2026

Summary

  • Seed data: Add SQL init scripts for PostgreSQL and MySQL (~1000 rows each) with deliberate diffs (5 deletes, 5 inserts, 10 updates) so developers can immediately demo data-diff against local containers
  • Docker profiles: Move ClickHouse, Presto, Trino, and Vertica behind profiles: [full] so docker compose up only starts PostgreSQL + MySQL (fast <10s startup)
  • Default connection strings: All docker-compose databases now have defaults in tests/common.py — no manual env vars needed
  • Makefile: Developer ergonomics with make up, make up-full, make down, make test, make test-unit, make demo
  • CI cleanup: Remove redundant DATADIFF_CLICKHOUSE_URI env override (now defaulted in code)
  • Bug fixes: Validate column types in set operations and fix PostgreSQL timestamp edge cases

Test plan

  • make down && make up — verify only PG + MySQL start
  • docker exec dd-postgresql psql -U postgres -c "SELECT count(*) FROM ratings_source" — should return 1000
  • docker exec dd-mysql mysql -umysql -pPassword1 mysql -e "SELECT count(*) FROM ratings_source" — should return 1000
  • make demo — runs data-diff against seed tables, shows diffs
  • make test-unit — unit tests pass without any database
  • make test — full suite passes against PG + MySQL
  • docker compose --profile full up -d --wait — all 6 services start
  • CI passes (ClickHouse URI no longer needed as env override)

🤖 Generated with Claude Code

dtsong and others added 2 commits March 2, 2026 01:27
…tamp edge cases

Add type-class validation to TableOp.schema and TableOp.type so that
unions, intersects, and minus operations reject mismatched column types
early with a clear QueryBuilderError instead of silently producing
incorrect results. (#5)

Fix PostgreSQL timestamp normalization: use timestamptz(6) cast for
TimestampTZ columns to preserve timezone info during bounds comparison,
and replace hardcoded TIMESTAMP_PRECISION_POS with length()-based
calculation to correctly pad years with >4 digits. (#12)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add SQL seed data (PostgreSQL + MySQL) with ~1000 rows and deliberate
diffs for showcasing data-diff. Default connection strings for all
docker-compose databases, add profiles to keep lightweight default
(PG + MySQL only), and add Makefile for developer ergonomics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d0021615d9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

tests/common.py Outdated
Comment on lines +35 to +36
TEST_VERTICA_CONN_STRING: str = (
os.environ.get("DATADIFF_VERTICA_URI") or "vertica://vertica:Password1@localhost:5433/vertica"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep Vertica tests opt-in by default

This new default URI makes Vertica always appear configured, so tests/common.py includes Vertica in CONN_STRINGS and parameterized integration tests will try to connect to it even when that service is not running; I checked .github/workflows/ci.yml and it only starts mysql postgres presto trino clickhouse, and Makefile test only starts postgres mysql, so both default workflows can now fail with connection errors unless users manually override the env var to disable Vertica.

Useful? React with 👍 / 👎.

dtsong and others added 3 commits March 2, 2026 01:49
Critical:
- Fix _add_padding double-truncation regression for rounding branch
  (split into _truncate_and_pad and _zero_pad for correct behavior)

Important:
- Fix non-rounding timestamp path to use timestamptz cast for TimestampTZ
- Add None guard to TableOp.type to avoid misleading errors
- Use QueryBuilderError consistently for schema length mismatch
- Revert Presto/Trino/Vertica conn defaults to None (CI doesn't test them)
- Remove unused Presto/Trino from CI docker compose command
- Add comprehensive tests for all timestamp paths and edge cases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Revert ClickHouse default conn string to None so `make test` skips
  ClickHouse when the container isn't running; set URI explicitly in CI
- Add None-schema guard in TableOp.schema with clear error message
- Return None (not optimistic type) when one side of TableOp.type is unknown
- Fix Makefile comment to accurately reflect PG + MySQL (not ClickHouse)
- Add comment explaining why Join.schema skips cross-table type validation
- Add tests for TableOp.type mismatch and matching branches

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add comment explaining why --profile full is needed (ClickHouse is
  profile-gated; only explicitly named services start)
- Add `or None` to Databricks and MsSQL conn strings to handle empty
  env vars consistently with all other optional databases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dtsong dtsong merged commit 7c9e4ad into master Mar 2, 2026
6 checks passed
@dtsong dtsong deleted the feat/docker-local-test-infrastructure branch March 2, 2026 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant